From Words To Tokens: The Byte-Pair Encoding Algorithm